ABSTRACT

The random error of radioastronomical measurements is usually computed in 
the weak-signal limit, which assumes that the system temperature is sensibly the 
same on and off source, or with and without a spectral line. This assumption is 
often very poor. We give examples of common situations in which it is important 
to distinguish the system noise in signal-bearing and signal-free regions. 

Subject headings: methods: data analysis; methods: statistics; instrumentation: 
miscellaneous 

1. Introduction 

Few experiments are performed without some attempt at estimating their errors, and 
the random errors of measurement in radio astronomy are typically determined in one 
general way. Some form of comparison is performed whereby samples are taken toward 
and away from a signal source, or with and without a spectral line. Subsequent analysis 
proceeds under the assumption that random errors everywhere in the dataset are as given 
by the statistical properties manifested in the signal-free regions. No attempt is made 
to measure the variances of signal-bearing and signal-free samples separately during the 
experiment, and, after the fact, random errors of measurement in signal-bearing samples 
are obscured because the form of the signal is arbitrary. Discussions of fitting and profile 
analysis invariably assume that measurement variances are the same with or without the 
signal, as for instance the Zeeman analysis of Marshall (1995) or the fitting of functions 
(e.g. Gaussians) by Kaper et al. (1966) or Rieu (1969). Textbook discussions contain no
suggestion that system noise is influenced by the presence of a signal or that samples
with different variances may be interleaved in the same datastream (Kraus 1986; Burke &
Graham-Smith 1997; Rohlfs et al. 2000).

Yet such treatment has been flawed for a surprisingly long time. H I lines of 100 K
have been routinely observed with sub-100 K receiving systems for more than 30 years.
Continuum sources whose antenna temperatures exceed the equivalent noise temperature
of the receiving equipment have been observed even longer. The error of measurement in
signal-bearing samples is often significantly different (with current receivers it could easily
differ by a factor of 5 at the peak of a strong galactic H I emission line) but the difference has
been ignored.

Error estimates determine confidence levels and even data containing strong signals 
can be compromised by misunderstanding of their significance; for instance, when two very 
strong signals are differenced to detect a smaller one in H I emission-absorption experiments 
and searches for Zeeman splitting. Considering how slowly experimental errors typically 
improve with the amount of time invested in an experiment, it follows that changes in 
the acknowledged errors of an experiment are equivalent to much larger differences in 
the observing time required to reach them. A priori knowledge of errors is an important 
element in the design of experiments and these considerations may have a significant effect 
on the planning of an observing session. They should be implemented in the software which 
supports analysis. 

The purpose of this work is to illustrate a variety of common situations where random 
error is dominated by the presence of a signal. In the following section some basics of radio 
astronomy measurement are sketched. These are used to analyse the statistics of noise and 
the errors of component fitting when signals are present in emission and absorption. The 
final section is a brief summary with an even briefer mention of the extension of these 
notions to aperture synthesis.

2. Power, temperature and noise

2.1. Basics

A temperature scale is established whereby power is compared to the classical power
spectral density $kT$ (W Hz$^{-1}$) in a resistor in thermal equilibrium at temperature $T$
(Dicke 1946; Kraus 1986; Rohlfs et al. 2000). The output power level of the telescope
system is then quoted as a 'system temperature', i.e., $kT_{\rm sys}$. The actual power density
$h\nu/(e^{h\nu/kT} - 1) + h\nu/2$ (Callen & Welton 1951) reduces to $kT$ only in the Rayleigh-Jeans
limit and when zero-point fluctuations are ignored.

In our simplified discussion we assert $T_{\rm sys} = T_r + T_A$. $T_r$ represents everything
which does not depend on any particular source or input signal and we assume that it is a
constant or constant function of frequency $\nu$: possible dependencies of $T_r$ are suppressed
for convenience of notation. Observing at a frequency $\nu$ entails a minimum contribution of
$h\nu/k$ to $T_{\rm sys}$, which is included in $T_r$.
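As a rough numerical check of the two statements above, the following sketch evaluates $h\nu/k$ and the Callen & Welton power density at the H I line frequency. The frequency (1420.40575 MHz, the value used later in Sect. 2.5), the sample temperature of 20 K and the function names are illustrative assumptions, not part of any particular software.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant, J s (SI exact value)
K_BOLTZ = 1.380649e-23      # Boltzmann constant, J/K (SI exact value)

def quantum_limit_kelvin(freq_hz):
    """Return h*nu/k, the minimum contribution to T_sys quoted in the text."""
    return H_PLANCK * freq_hz / K_BOLTZ

def callen_welton_density(freq_hz, temp_k):
    """Power spectral density h*nu/(exp(h*nu/kT) - 1) + h*nu/2, in W/Hz."""
    x = H_PLANCK * freq_hz / (K_BOLTZ * temp_k)
    return H_PLANCK * freq_hz * (1.0 / math.expm1(x) + 0.5)

freq = 1420.40575e6                       # assumed observing frequency, Hz
t_min = quantum_limit_kelvin(freq)        # roughly 0.07 K at this frequency
ratio = callen_welton_density(freq, 20.0) / (K_BOLTZ * 20.0)
print(f"h nu / k = {t_min:.4f} K; density / kT at 20 K = {ratio:.8f}")
```

At centimetre wavelengths the quantum term is therefore a small fraction of a kelvin, which is why it can be absorbed into the receiver term without further comment.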

$T_A$ represents a signal external to the telescope. The equivalent temperature of a signal
is its 'antenna temperature' which by convention is related to the incident flux density
(W m$^{-2}$ Hz$^{-1}$) as $S_\nu = 2kT_A/A$. The effective area $A$ is proportional to the geometric area
of the telescope aperture.
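The conversion between flux density and antenna temperature can be sketched as follows. The 25 m dish diameter is borrowed from the telescope discussed in Sect. 2.4, but the aperture efficiency of 0.6 is a purely hypothetical value chosen for illustration, as are the function and variable names.

```python
import math

K_BOLTZ = 1.380649e-23   # Boltzmann constant, J/K
JANSKY = 1.0e-26         # 1 Jy expressed in W m^-2 Hz^-1

def antenna_temperature(flux_jy, effective_area_m2):
    """Invert S_nu = 2 k T_A / A for T_A (kelvin)."""
    return flux_jy * JANSKY * effective_area_m2 / (2.0 * K_BOLTZ)

# Hypothetical 25 m dish with an assumed aperture efficiency of 0.6.
dish_diameter_m = 25.0
aperture_efficiency = 0.6
area = aperture_efficiency * math.pi * (dish_diameter_m / 2.0) ** 2

gain_k_per_jy = antenna_temperature(1.0, area)
print(f"effective area = {area:.1f} m^2, T_A per Jy = {gain_k_per_jy:.3f} K")
```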

The signal may be confined in space or frequency, so we write $T_A = T_A(v)$ where $v$
is some combination of independent variables. In the presence of signal the power density
is $kT_{\rm sys}(v) = kT_r + kT_A(v)$ and the dependence of $T_A$ upon $v$ makes $T_{\rm sys}$ similarly
dependent. If $v$ represents the pointing of the telescope, added power comes and goes as the
telescope moves. Alternatively, $v$ may be velocity or frequency, and, as far as the receiver
and square-law detector are concerned, the presence of signal at some $v = v'$ is not manifested
at $v \neq v'$. The passband may be translated or inverted by mixing, but the receiver and
detector electronics are entirely linear in frequency. The spectrum is not jumbled nor is it
appreciably smoothed until it is integrated and channelized in the so-called 'backend'. In
Sect. 2.7 we discuss an exception to this linearity, namely, quantization noise in digital
correlator spectrometers.



2.2. Passband or system noise as a measurement variance

Eventually a datastream is formed from samples of $T_{\rm sys}$, each of duration $t$ (say) taken
over a spectral width $\Delta\nu$; this could be a spectrum, a continuum drift scan, etc. Associated
with measurement of the output power $kT_{\rm sys}$ there is a variance given by the Dicke (1946)
or radiometer equation:

$$\Delta T(v)^2 = \frac{T_{\rm sys}(v)^2}{N} = \frac{(T_r + T_A(v))^2}{N} \qquad (1)$$

The dimensionless quantity $N = \Delta\nu\, t$ is the product of the bandwidth measured in Hz
and the integration time in seconds. Precise determination of the output power density $kT$
within a band $\Delta\nu$ is done by averaging $N$ independent samples. Within a band of width $\Delta\nu$
about some frequency $\nu$, the contained frequency components beat each other down to a
frequency range $0 \ldots \Delta\nu$ so that all appear together summed within one channel of this width.

Radiometer noise in the output datastream is the measurement variance of the power,
independent of whether that power was contributed by $T_r$ or $T_A$. So the variance of the
measured strength of an emission line, usually considered to be set only by $T_r$, actually
increases in proportion to the source strength itself, weakly for weak signals and more
strongly for very strong ones.

$^1$ Even so, independence of adjacent 1 kHz slices of the spectrum, corresponding to 0.2
km s$^{-1}$ at the H I line, requires a minimum integration time of order only 1 msec.
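A minimal sketch of the radiometer equation (Eqn. 1) makes the dependence on source strength concrete. The receiver temperature, line strength, channel width and integration time below are illustrative assumptions only; they are chosen so that the rms at the line peak is five times the baseline value, the order of magnitude mentioned in the Introduction.

```python
import math

C_KMS = 299792.458  # speed of light, km/s

def radiometer_rms(t_r, t_a, bandwidth_hz, time_s):
    """Eqn. 1: rms = (T_r + T_A) / sqrt(N), with N = bandwidth * time."""
    n_samples = bandwidth_hz * time_s
    return (t_r + t_a) / math.sqrt(n_samples)

# Illustrative values: 25 K receiver, 100 K line peak, 1 kHz channel, 60 s.
t_r, t_peak = 25.0, 100.0
rms_baseline = radiometer_rms(t_r, 0.0, 1.0e3, 60.0)
rms_peak = radiometer_rms(t_r, t_peak, 1.0e3, 60.0)
noise_ratio = rms_peak / rms_baseline          # equals 1 + T_A / T_r

# Footnote arithmetic: velocity width of a 1 kHz slice at the H I line,
# and the time scale 1/bandwidth for one independent sample of that slice.
velocity_width_kms = C_KMS * 1.0e3 / 1420.40575e6
independence_time_s = 1.0 / 1.0e3

print(f"baseline rms {rms_baseline:.3f} K, peak rms {rms_peak:.3f} K, ratio {noise_ratio:.1f}")
print(f"1 kHz = {velocity_width_kms:.3f} km/s; independence time {independence_time_s * 1e3:.0f} ms")
```

The ratio depends only on $T_A/T_r$, not on bandwidth or integration time, so longer integration lowers both rms values together without changing their contrast.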



2.3. Normalization and noise in real-world experiments 

As examples of the way that random error is affected by considerations of experimental 
design, we compare some common methods of data-taking. We consider that it is possible 
to take data "on"- or "off"-source; if the data are spectra, even the on-source data may
have regions of the bandpass which are signal-free. 

In the simplest case where data are taken while staring at the source, the variance is
given directly by Eqn. 1:

$$\Delta T(v)^2\,({\rm on}) = \frac{(T_r + T_A(v))^2}{N_{\rm on}}. \qquad (2a)$$

When on- and off-source data are differenced the variance is

$$\Delta T(v)^2\,({\rm on-off}) = \frac{(T_r + T_A(v))^2}{N_{\rm on}} + \frac{T_r^2}{N_{\rm off}} \qquad (2b)$$

and differencing increases the rms by a larger factor in signal-free regions than at the
signal peak.

In some cases a quotient is formed from on- and off-source data: the mean off-source
power level is equated to a number, $T_r$, and data appear in the form $T_r$(on/off) or
$T_r$(on-off)/off. Both have the variance

$$\Delta T(v)^2\,({\rm on/off}) = \frac{(T_r + T_A(v))^2}{N_{\rm on}} + \frac{(T_r + T_A(v))^2}{N_{\rm off}} \qquad (2c)$$

so formation of the quotient increases the rms in the signal-bearing regions relative
to the case where simple differencing is done, and everywhere relative to the pure "on"
spectrum.

Because of such considerations, it is not possible to calculate the random error in 
signal-bearing regions, even given the empirical rms in the signal-free regions and the 
system properties which pertain to them, unless it is also understood how the data were 
taken. 
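The point can be illustrated with a short sketch that starts from the same empirical baseline rms and asks what rms applies at a signal peak under each observing mode. Equal on and off integrations ($N_{\rm on} = N_{\rm off}$) and the numerical values (0.06 K baseline rms, 20 K receiver, 80 K signal) are assumptions made for the example; the function names are hypothetical.

```python
import math

def rms_on(t_r, t_a, n_on):
    """Pure on-source spectrum: (T_r + T_A) / sqrt(N_on)."""
    return (t_r + t_a) / math.sqrt(n_on)

def rms_on_minus_off(t_r, t_a, n_on, n_off):
    """On-source minus off-source: signal enters the on term only."""
    return math.sqrt((t_r + t_a) ** 2 / n_on + t_r ** 2 / n_off)

def rms_quotient(t_r, t_a, n_on, n_off):
    """T_r*(on/off) or T_r*(on-off)/off: signal enters both terms."""
    return math.sqrt((t_r + t_a) ** 2 / n_on + (t_r + t_a) ** 2 / n_off)

MODES = {
    "on": lambda t_r, t_a, n: rms_on(t_r, t_a, n),
    "on-off": lambda t_r, t_a, n: rms_on_minus_off(t_r, t_a, n, n),
    "quotient": lambda t_r, t_a, n: rms_quotient(t_r, t_a, n, n),
}

def peak_rms_from_baseline(baseline_rms, t_r, t_a, mode):
    """Scale an empirical baseline rms to a channel holding signal T_A."""
    rms = MODES[mode]
    # rms scales as 1/sqrt(N): rescale a trial N = 1 to match the baseline.
    n = (rms(t_r, 0.0, 1.0) / baseline_rms) ** 2
    return rms(t_r, t_a, n)

results = {m: peak_rms_from_baseline(0.06, 20.0, 80.0, m) for m in MODES}
for mode, value in results.items():
    print(f"{mode:9s} peak rms = {value:.3f} K")
```

With identical baseline statistics, the differenced spectrum has a noticeably smaller peak rms than the pure "on" or quotient spectra, so the baseline rms alone does not fix the error in signal-bearing channels.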

2.4. Emission line profiles 

One obvious example of the strong signal limit of a spectral line is galactic atomic
hydrogen. Fig. 1 shows a typical low latitude galactic H I profile observed with a 25m
telescope (Hartmann & Burton 1997) during the Leiden-Dwingeloo Sky Survey (LDSS).
In the lower panel, the scale is expanded to show how the $\pm 1\sigma$ noise envelope varies for
data taken in the form (on-off)/off with $T_r = 36$ K, a typical value during the survey. The
spectrum in Fig. 1 still has a very high peak/rms signal-to-noise ratio (465:1), but not nearly as
good as the 1700:1 implied by the 0.06 K rms over the baseline regions: the rms error of the
integrated brightness is nearly twice as high as that estimated from the baseline rms level.
H I is now commonly observed with $T_r$ = 15 to 25 K. If the same profile were reobserved
with $T_r = 18$ K for one-fourth the amount of time (to reach the same baseline rms), the
line-generated rms error would be twice as high again.
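The scaling argument in the last sentence can be checked with a few lines, assuming data of the form (on-off)/off with equal on and off integrations. The 100 K line strength is a hypothetical round number, not the measured peak of the profile in Fig. 1, and the function name is illustrative.

```python
import math

def quotient_rms(t_r, t_a, sqrt_n):
    """rms of (on-off)/off data with N_on = N_off = N."""
    return math.sqrt(2.0) * (t_r + t_a) / sqrt_n

baseline_rms = 0.06                    # K, quoted for the survey spectrum
t_peak = 100.0                         # K, hypothetical line strength

# Survey-like case: T_r = 36 K; choose sqrt(N) to give the quoted baseline.
sqrt_n_36 = math.sqrt(2.0) * 36.0 / baseline_rms
# Re-observation: T_r = 18 K for one fourth of the time, so sqrt(N) halves.
sqrt_n_18 = sqrt_n_36 / 2.0

base_36 = quotient_rms(36.0, 0.0, sqrt_n_36)
base_18 = quotient_rms(18.0, 0.0, sqrt_n_18)
line_term_36 = quotient_rms(36.0, t_peak, sqrt_n_36) - base_36
line_term_18 = quotient_rms(18.0, t_peak, sqrt_n_18) - base_18

print(f"baseline rms: {base_36:.3f} K vs {base_18:.3f} K")
print(f"line-generated rms: {line_term_36:.3f} K vs {line_term_18:.3f} K")
```

The baseline rms is unchanged while the part of the rms generated by the line doubles, because that part scales as $T_A/\sqrt{N}$ and is untouched by the lower receiver temperature.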

From the LDSS, we find that some 41% of the sky contains H I with a peak brightness
$T_B > 20$ K, 33% has $T_B > 30$ K and 27% has $T_B > 40$ K (for $0^\circ < l < 180^\circ$, $0^\circ < b < 90^\circ$).

2.5. Profile fitting

Discussions of profile fitting typically assume that the rms fluctuation is the same in 
every channel of a spectrum; to do otherwise would introduce imponderables and greatly 
hinder general understanding. However, datapoints having a higher rms should be accorded 
lower weight. 

We did a numerical experiment, fitting pure Gaussian profiles of fixed width
(FWHM $= \Delta V$) and varying strength $T_A(0)$, in the presence of noise which varies following
Eqn. 1 (a pure "on" scan following the discussion of Sect. 2.3). We constructed spectra
with 1 km s$^{-1}$ channels at an assumed observing frequency of 1420.40575 MHz (the $\lambda$21cm
line), using $T_r = 20$ K typical of modern H I receivers. We assumed an observing time of 30
seconds, so that $\sqrt{N} = 376.9$ in Eqn. 1 or $\Delta T = 0.053$ K when $T_A = 0$. We then inserted
Gaussian lines having $\Delta V = 10$ km s$^{-1}$ and peak strengths $T_A(0)$ = 2.5, 5.0, 10, 20, ... 160
K, with the variance of the noise in accord with Eqn. 1. Ensembles of such spectra were
generated for each value of $T_A(0)$ and fit to single Gaussians. The fitting was done twice
for each spectrum, weighting by constant or (correctly) changing variance.
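The setup described above can be illustrated in simplified form. The sketch below is not the experiment itself: it treats only the line amplitude as a free parameter (centre and width held fixed), so the fit is linear and the parameter variances follow from standard weighted least-squares formulae instead of from ensembles of simulated spectra. The velocity grid, the subset of line strengths and all names are assumptions made for the example.

```python
import numpy as np

C_KMS = 299792.458                     # speed of light, km/s
FREQ_HZ = 1420.40575e6                 # H I line frequency
T_R = 20.0                             # K
CHAN_KMS, TIME_S, FWHM_KMS = 1.0, 30.0, 10.0

# N = bandwidth * time for one 1 km/s channel.
chan_hz = FREQ_HZ * CHAN_KMS / C_KMS
sqrt_n = np.sqrt(chan_hz * TIME_S)     # about 377.0 with the exact value of c
sigma0 = T_R / sqrt_n                  # rms where T_A = 0, about 0.053 K

def amplitude_errors(t_peak):
    """Return (claimed, actual_constant_weight, actual_correct_weight) rms
    of the fitted amplitude of a Gaussian of known centre and width."""
    v = np.arange(-50.0, 50.0 + CHAN_KMS, CHAN_KMS)
    g = np.exp(-4.0 * np.log(2.0) * (v / FWHM_KMS) ** 2)   # unit-peak shape
    sigma = (T_R + t_peak * g) / sqrt_n                     # Eqn. 1 per channel
    claimed = sigma0 / np.sqrt(np.sum(g ** 2))              # assumes constant rms
    actual_const = np.sqrt(np.sum(g ** 2 * sigma ** 2)) / np.sum(g ** 2)
    actual_weighted = 1.0 / np.sqrt(np.sum(g ** 2 / sigma ** 2))
    return claimed, actual_const, actual_weighted

for t_peak in (2.5, 20.0, 160.0):      # a subset of the strengths in the text
    c, a, w = amplitude_errors(t_peak)
    print(f"T_A(0) = {t_peak:6.1f} K  claimed {c:.4f}  "
          f"constant-weight {a:.4f}  weighted {w:.4f} K")
```

With the exact speed of light the code gives $\sqrt{N} \approx 377.0$; the quoted 376.9 is reproduced if $c$ is rounded to $3 \times 10^5$ km s$^{-1}$, which may be what was done. Correct weighting here uses the true per-channel variance, which is known in a simulation but must be estimated from the data in practice.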

The results of this experiment are reported in Fig. 2. The bottom curve in each panel 
is the rms of the fitted parameter given by analytic formulae, which coincides with the 
mean a posteriori error estimate returned by fitting software which assumes a constant 
profile rms. Stronger lines lead to linear improvements in fitting of the central velocity 
and width in this case, while the peak and profile integral fits are independent of strength; 
the fractional precision increases but fitting to the profile integral does not achieve higher 
precision than simply summing the channel values. 

The uppermost curve in each panel is the actual rms of the parameter determination
with weighting by a constant variance. The shaded (middle) curve is the rms with proper
weighting; in this case, the fitting software returns accurate error estimates. Several
phenomena are discernible in this diagram. There are irremediable increases in the variances
of the fitted parameters relative to the case of constant profile rms. The precision of the
fitted centroid and width improves only very slowly for strong signals, instead of linearly.
Variances of the peak and integrated strengths increase in absolute terms as well. The
fitting is only very slightly improved by correct weighting and the actual variances and the
claimed error estimates diverge sharply if the behaviour of the noise is ignored. This could
be misconstrued as implying that the stronger lines are less purely Gaussian.

2.6. Sensitivity of absorption measurements 

Staring at a continuum source characterized by an antenna temperature $T_A = T_c$
results in a system temperature $T_{\rm sys} = T_r + T_c$. If the continuum is extinguished by a pure
scatterer characterized by optical depth $\tau(v)$, it follows that

$$T_{\rm sys}(v) = T_r + T_c e^{-\tau(v)} \qquad (4)$$

$$\Delta T(v) = \frac{T_{\rm sys}(v)}{\sqrt{N}} = \frac{T_r + T_c e^{-\tau(v)}}{\sqrt{N}} \qquad (5)$$

The system temperature is higher where there is no absorption. Eqn. 4 can
be inverted to solve for the optical depth from the observed profile of $T_{\rm sys}(v)$, i.e.
$\tau(v) = -\ln((T_{\rm sys}(v) - T_r)/T_c)$. Neglecting other effects, the rms of the line/continuum
ratio (the argument of the logarithm in this expression) is just $\sigma_{l/c} = \Delta T/T_c$. $\sigma_{l/c}$ may be
normally distributed but the logarithmic dependence of $\tau(v)$ makes its error distribution
noticeably asymmetric for moderate to large optical depth. The change in the derived optical
depth for a given fluctuation $\delta_{l/c}$ in the line/continuum ratio can be written

$$\delta\tau(v)^{\mp} = \tau(v) + \ln\left(e^{-\tau(v)} \pm \delta_{l/c}\right) \qquad (6)$$

where $\pm$ and $\mp$ convey the sense of the variations. Differentiation yields the rms of the
derived optical depth

$$\Delta\tau(v) = \sigma_{l/c}\, e^{\tau(v)} \left[\frac{T_r + T_c e^{-\tau(v)}}{T_r + T_c}\right]. \qquad (7)$$

The usual analysis sets $\tau(v) = 0$ on the right-hand side of Eqn. 5 so that the term in
brackets in Eqn. 7 is unity. In Fig. 3 we plot $\delta\tau^{+}/\tau$ vs. $\tau$ for different $T_c$, taking $\delta_{l/c} = \sigma_{l/c}$
in Eqn. 6 and assuming $T_r = 20$ K, $\sqrt{N} = 376.9$ as before. In the usual analysis (upper
panel) the fractional error in optical depth is minimized at $\tau \sim 1$ for all $T_c$ and sensitivity
appears to saturate at rather small $T_c \sim T_r$. However, use of Eqn. 7 shows that the
sensitivity never saturates, in the sense that it is possible to achieve higher precision on
ever-thicker lines (lower panel). Furthermore, the error in optical depth at $\tau \approx 1$ is much
smaller in the lower panel when $T_c > T_r$.
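The contrast between the two analyses can be sketched numerically. The code below uses the linearized rms of Eqn. 7, not the asymmetric excursions of Eqn. 6 that are plotted in Fig. 3, and it assumes that $\sigma_{l/c}$ in Eqn. 7 is evaluated in the line-free continuum so that the bracketed term carries all of the dependence on optical depth. The continuum strengths, the grid of optical depths and the function names are illustrative choices.

```python
import numpy as np

T_R = 20.0          # K, as in the text
SQRT_N = 376.9      # as in the text

def frac_error_usual(tau, t_c):
    """Fractional error with T_sys fixed at its line-free value (bracket = 1)."""
    sigma_lc = (T_R + t_c) / (t_c * SQRT_N)
    return sigma_lc * np.exp(tau) / tau

def frac_error_full(tau, t_c):
    """Fractional error from Eqn. 7, with T_sys following the absorption."""
    sigma_lc = (T_R + t_c) / (t_c * SQRT_N)
    bracket = (T_R + t_c * np.exp(-tau)) / (T_R + t_c)
    return sigma_lc * np.exp(tau) * bracket / tau

tau = np.linspace(0.05, 6.0, 1191)     # grid step 0.005
best = {}
for t_c in (20.0, 200.0, 2000.0):      # illustrative continuum strengths, K
    tau_usual = tau[np.argmin(frac_error_usual(tau, t_c))]
    tau_full = tau[np.argmin(frac_error_full(tau, t_c))]
    best[t_c] = (tau_usual, tau_full)
    print(f"T_c = {t_c:6.0f} K: best tau (usual) = {tau_usual:.2f}, "
          f"best tau (Eqn. 7) = {tau_full:.2f}")
```

In the usual analysis the optimum stays at an optical depth of unity whatever the continuum strength, whereas with Eqn. 7 it moves to progressively thicker lines as $T_c$ grows.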

Numerical experiments doing Gaussian fitting to absorption lines showed (as before) 
that proper weighting gives slightly improved parameter variances, and much-improved 
error estimates. Because the rms is higher in signal-free regions, naive error estimates 
returned by unwitting software are too large. Error in determining the continuum level 
of the baseline regions of an absorption spectrum increases with $T_c$ and may eventually
become the limiting factor in determining the line/continuum ratio. 



2.7. Quantization noise 



Use of digital correlators represents a possible departure from the frequency-preserving
character of the receiver and detector front-end, owing to the phenomenon of quantization
noise (Bennett 1948; Gwinn et al. 2000). Input to the correlator is bandpass filtered so
that the sampling theorem may be applied to recovery of the data, but digitization of the 
continuously varying input power results in a representation of the signal which is very 
strongly not band-limited. That portion of the power spectrum lying outside the original 
band is returned, in varying degree depending on the sampling rate, as a form of noise. 
For Nyquist sampling (sampling at a rate twice the bandwidth) all is returned. For faster 
sampling the return is reduced as sampling sidebands beat with weaker, further-out portions 
of the quantization noise spectrum. As shown by Bennett (1948) for a 16-level system, 
quantization noise is steadily reduced until the sampling rate is 10 times Nyquist. 

Thus, sampling and quantization schemes scatter input power throughout the passband. 
Experiments using input thermal noise on systems with (many) more bits than are used in 
radio astronomy show that the quantization noise is essentially white (ibid) but the spectral 
characteristics of quantization noise are very much dependent on the form of the input. 
Very strong, highly confined signals can produce distortions of the outlying passband. 
Weaker signals will simply be dispersed with little effect on either the noise level or shape 
of the passband. 

Because of quantization noise, even the blackest absorption line will not reduce the
rms to the level attained in the absence of all input signal. Eqn. 5, modified to account for
quantization loss $(1-\epsilon)$ in the case that strong absorption occupies a negligibly small fraction
of the correlator bandpass (so that the quantization noise remains evenly distributed over
the passband) is

$$\Delta T = \frac{T_r + T_c\left(\epsilon e^{-\tau} + (1-\epsilon)\right)}{\epsilon\sqrt{N}}$$

Examples of quantization losses at Nyquist sampling rates are $(1-\epsilon)$ = 0.36 (1-bit or
2-level quantization), 0.12 (3-level) and 0.028 (9-level), so that minimum fractions $(1-\epsilon)/\epsilon$
= 0.57, 0.14 or 0.03 of the rms corresponding to the input $T_c$ would unavoidably be
present in each channel, including those at the bottom of the line. This complicates error
analysis but the high efficiencies of modern correlators preserve at least some of the benefits
discussed. Such considerations are another reason to prefer higher-level quantization and
over-sampling schemes.
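The quoted fractions follow directly from the efficiencies, as the sketch below shows. The two-level efficiency is taken as $2/\pi$, the standard value for one-bit correlation at Nyquist sampling (a loss of about 0.36); the three- and nine-level efficiencies are simply one minus the losses quoted above. The receiver and continuum temperatures and the function names are assumptions for the example.

```python
import math

def rms_with_quantization(t_r, t_c, tau, efficiency, sqrt_n):
    """Modified Eqn. 5 for a correlator of quantization efficiency epsilon."""
    eps = efficiency
    return (t_r + t_c * (eps * math.exp(-tau) + (1.0 - eps))) / (eps * sqrt_n)

def residual_fraction(efficiency):
    """(1 - eps)/eps: share of the continuum rms left under a black line."""
    return (1.0 - efficiency) / efficiency

# 2/pi is the standard two-level efficiency; the others are one minus the
# losses 0.12 and 0.028 quoted in the text.
efficiencies = {"2-level": 2.0 / math.pi, "3-level": 0.88, "9-level": 0.972}
fractions = {name: residual_fraction(e) for name, e in efficiencies.items()}

# Cross-check against the modified Eqn. 5 for a very opaque line (tau = 50).
t_r, t_c, sqrt_n = 20.0, 100.0, 376.9   # illustrative values
eps = efficiencies["2-level"]
floor = rms_with_quantization(t_r, t_c, 50.0, eps, sqrt_n)
no_signal = t_r / (eps * sqrt_n)
continuum_rms = t_c / sqrt_n
excess_fraction = (floor - no_signal) / continuum_rms

for name, frac in fractions.items():
    print(f"{name}: (1 - eps)/eps = {frac:.3f}")
```

Using the unrounded two-level efficiency gives 0.57, in agreement with the text, whereas the rounded loss of 0.36 would give 0.56.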

3. Summary and extension to interferometry 

Radio astronomers frequently observe signals which are strong enough to dominate the 
random errors of their experiments. Unfortunately, it is not always possible to recognize 
the effects which are induced and they are neglected. Nonetheless, they have always been 
present in the data. 

This discussion points up obvious deficiencies in extant data reduction software and 
analysis techniques. Perhaps less obvious is the need not only for accurate calibration but 
also for reliable reporting on the part of the telescope systems. Measurement errors cannot 
be accurately assessed and accommodated in downstream data handling unless the system,
continuum and line antenna temperatures are preserved, along with knowledge of the mode 
of data-taking. Synthesis instruments may be particularly difficult in this regard. Consider 
the use of the VLA (say) to detect H I absorption against a continuum source at low 
galactic latitude in the presence of an emission profile like that shown in Fig. 1. The VLA 
does not return the total power or single-dish spectra, or, equivalently, the variation of $T_{\rm sys}$
across the passband. The interferometer experiment per se can only succeed to the extent 
that foreground emission disappears; only its added noise contribution remains. 

We began the discussion by pointing out that the noise contributed from sky signals
in single-dish observations occurs (ignoring sidelobes, quantization noise and the like)
at those places and/or frequencies where the sources themselves are located. It is an
interesting endeavour to try to understand the extent to which source noise in interferometer 
experiments is similarly localized in the output datastream. For phased arrays it would 
seem possible to reproduce the single-dish mode. For synthesis arrays (Anantharamaiah 
et al. 1989; Crane & Napier 1989) the situation is much more complicated and uncertain 
even in the weak signal limit. 

The National Radio Astronomy Observatory is operated by AUI, Inc. under a 
cooperative agreement with the US National Science Foundation. I thank Darrel Emerson, 
Tony Kerr, Robert Lucas and A. R. (Dick) Thompson for helpful comments. Barry Clark 
pointed out the relevance of quantization noise and Fred Schwab provided the reference 
to Bennett (1948). This paper was put in final form while the author was enjoying the 
hospitality of the IAP in Paris.

Fig. 1. Top: H I emission observed at $(l,b) = (60^\circ, +1.5^\circ)$ with the 25m Dwingeloo telescope
by Hartmann & Burton (1997). Bottom: expanded view of the $\pm 1\sigma$ noise envelope assuming
$T_r = 36$ K.

Fig. 2. Rms error of derived Gaussian fitting parameters. Top left, central velocity. Bottom
left, the FWHM, $\Delta V$. Top right, the peak line strength. Bottom right, the profile integral.
In each panel the uppermost curve is the empirically-determined rms and the bottommost
curve the expected or reported rms, all for weighting by a constant profile variance. The
middle curve is the parameter rms when weighting by the correct noise variance. For the
other assumptions used to calculate these curves, see Sect. 2.5.

Fig. 3. Fractional rms error in optical depth when a nominal 20 K system is used to
observe continuum sources of varying strengths $T_c$, occulted by a pure scattering medium
of optical depth $\tau$. These plots correspond roughly to 30 second integrations in 1 km s$^{-1}$
channels at 1420 MHz. At top, $T_{\rm sys}$ is assumed independent of $\tau$; at bottom the dependence
of $T_{\rm sys}$ on $\tau$ (Eqn. 7) is included.



